(54) Cache coherent protocol in which exclusive and modified data is transferred to requesting
agent from snooping agent 



(57) A system may include two or more agents, at least some
of which may cache data. In response to a read transaction, a
caching agent may snoop its cached data and provide a response
in a response phase of the transaction. Particularly, the
response may include an exclusive indication used to represent
both exclusive and modified states within that agent. In one
embodiment, the agent responding exclusive may be responsible
for providing the data for a read transaction, and may transmit
an indication of which of the exclusive or modified state that
agent had the data in concurrent with transmitting the data.
Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] This invention is related to the field of digital 
systems and, more particularly, to maintaining cache co- 
herency in such systems. 


2. Description of the Related Art 

[0002] A bus is frequently used in digital systems to
interconnect a variety of devices included in the digital
system. Generally, one or more devices are connected to the
bus, and use the bus to communicate with other devices
connected to the bus. As used herein, the term "agent" refers
to a device which is capable of communicating on the bus. The
agent may be a requesting agent if the agent is capable of
initiating transactions on the bus and may be a responding
agent if the agent is capable of responding to a transaction
initiated by a requesting agent. A given agent may be capable
of being both a requesting agent and a responding agent.
Additionally, a "transaction" is a communication on the bus.
The transaction may include an address transfer and optionally
a data transfer. Transactions may be read transactions
(transfers of data from the responding agent to the requesting
agent) and write transactions (transfers of data from the
requesting agent to the responding agent). Transactions may
further include various coherency commands which may or may
not involve a transfer of data.

[0003] A feature of many buses is a coherency protocol. The
protocol is used by agents to ensure that transactions are
performed in a coherent manner. More particularly, the
coherency protocol is used, when one or more agents may cache
data corresponding to a memory location, to ensure that cached
copies and the memory location are updated to reflect the
effect of various transactions which may be performed by
various agents.

[0004] In some cases, coherency may be maintained via a
snooping process in which each agent which may cache data may
search its caches for a copy of the data affected by the
transaction, as well as the state that the copy is in. As used
herein, the "state" of a cached copy of data may indicate a
level of ownership of the data by the caching agent. The level
of ownership indicates what operations are permissible on the
cached copy. For example, a read of the cached copy may
generally be permissible with any level of ownership other
than no ownership (i.e. no cached copy is stored). A write may
be permissible for levels of ownership which indicate that no
other cached copies exist. An exemplary set of states may be
the Modified, Exclusive, Shared, and Invalid (MESI) states or
the MOESI states (including the MESI states and an owned
state). Caching agents may report, using the coherency
protocol, the state of the data within that agent. Based on
the states reported using the coherency protocol, each agent
may determine the action to take to update its state for the
data being accessed by the transaction.
[0005] It is desirable for the state of the cached copy 
to be reported as soon as possible. Delayed reporting 
of the state may result in increased latency for the trans- 
action. Furthermore, the amount of delay from initiating 
the transaction to reporting the state of the data affected 
by the transaction may make the coherency mechanism 
more complex. Unfortunately, it may be difficult to deter- 
mine the exact state of the data quickly. Furthermore, to 
determine the exact state of the data may require intru- 
sive changes to caches within the agent and/or to cir- 
cuitry that interfaces with the caches. 

SUMMARY OF THE INVENTION 

[0006] The problems outlined above are in large part 
solved by a system as described herein. The system 
may include two or more agents, at least some of which 
may cache data. In response to a transaction, a caching 
agent may snoop its cached data and provide a re- 
sponse in a response phase of the transaction. Partic- 
ularly, the response may include an exclusive indication 
used to represent both exclusive and modified states 
within that agent. In one embodiment, the agent re- 
sponding exclusive may be responsible for providing the 
data for a read transaction, and may transmit an indication
of which of the exclusive or modified state that agent
had the data in concurrent with transmitting the data. 
Thus, the caching agents may defer determining which 
of the exclusive state or the modified state that agent 
has the data in. Snooping hardware may be simplified, 
and may allow for a rapid snoop response. 
[0007] In one embodiment, the bus on which transac- 
tions are transmitted is a split transaction bus in which 
the data bus is separately arbitrated for by the respond- 
ing agent. In the case of an exclusive snoop hit for a 
read, the responding agent may be the agent that re- 
sponded exclusive. Thus, the responding agent may 
control when the data is provided, and thus the agents 
may have flexibility in responding to exclusive snoop 
hits. This flexibility may be used to provide a relatively 
nonintrusive mechanism for fetching data and perform- 
ing snoop updates within the agent. 
[0008] In another embodiment of the system, the 
caching agent may provide a modified response in the 
response phase if the data is in the modified state at the 
time of the snoop (as well as an exclusive response if 
the data is in the exclusive state at the time of the 
snoop), but may provide the data for a read transaction 
if the response is either exclusive or modified. Such an 
implementation may allow for the caching agent to mod- 
ify the data prior to providing the data, even if the data 
is in the exclusive state at the time of the snoop. The 
mechanism for fetching the data within the agent may 
be made relatively nonintrusive. For example, the mechanism
may not block in-flight stores from modifying exclusive data
before the data is fetched from the data cache (and the state
changed in the data cache), even if a response of exclusive
has already been given for the transaction. The caching agent
may indicate that the data is modified when providing the
data, if the data was
modified between the snoop and the transmission of the 
data. 

[0009] Broadly speaking, a system is contemplated. The system
comprises a first agent configured to transmit an address of a
transaction, and a second agent coupled to receive the
address. The second agent is configured to transmit an
indication of a state, within the second agent, of data
corresponding to the address. The indication indicates an
exclusive state for both the exclusive state and a modified
state of the data within the second agent.

[0010] Additionally, a second system is contemplated
comprising a first agent configured to transmit an address of
a read transaction, and a second agent coupled to receive the
address. The second agent is configured to provide data
corresponding to the address to the first agent responsive to
the second agent having the data in an exclusive state.
[0011] Moreover, an agent is contemplated. The agent comprises
a storage configured to store state information indicative of
a state of data stored within the agent, and a circuit coupled
to the storage and to receive an address of a transaction. The
circuit is configured to generate an indication of a state of
data corresponding to the address responsive to the state
information in the storage. The indication indicates an
exclusive state for both an exclusive state within the agent
and a modified state within the agent.
[0012] Still further, a second agent is contemplated. The
agent includes a data cache configured to store data in a
plurality of states including an exclusive state and a
modified state, and a circuit coupled to the data cache. The
circuit is configured to retrieve first data from the data
cache and to provide the first data in response to a read
transaction operating on the first data if the first data is
in the exclusive state.

[0013] Furthermore, a method is contemplated. An address of a
transaction is received in an agent. The agent responds during
a response phase of the transaction with an exclusive state
for both the exclusive state and a modified state of data
corresponding to the address within the agent.

[0014] Additionally, another method is contemplated. An
address of a read transaction is received in an agent. Data is
transmitted from the agent for the transaction responsive to
the agent having the data in an exclusive state.


BRIEF DESCRIPTION OF THE DRAWINGS 

[0015] Other objects and advantages of the invention will
become apparent upon reading the following detailed
description and upon reference to the accompanying drawings in
which:

Fig. 1 is a block diagram of one embodiment of a system.

Fig. 2 is a timing diagram of a transaction according to one
embodiment of the system shown in Fig. 1.

Fig. 3 is a block diagram of one embodiment of a processor
shown in Figs. 1 and 2.

Fig. 4 is a state diagram of an Exclusive, Shared, and Invalid
(ESI) coherency protocol.

Fig. 5 is a state diagram of a Modified, Exclusive, Shared,
and Invalid (MESI) coherency protocol.

Fig. 6 is a block diagram of an exemplary pipeline which may
be employed within one embodiment of the processor shown in
Fig. 3.

Fig. 7 is a flowchart illustrating operation of one embodiment
of a bus interface unit shown in Fig. 3.

Fig. 8 is a flowchart illustrating operation of one embodiment
of a memory system including an L2 cache and a memory
controller shown in Fig. 1.

Fig. 9 is a block diagram of one embodiment of a carrier
medium carrying a database representing the system shown in
Fig. 1.

[0016] While the invention is susceptible to various 
modifications and alternative forms, specific embodi- 
ments thereof are shown by way of example in the draw- 
ings and will herein be described in detail. It should be 
understood, however, that the drawings and detailed de- 
scription thereto are not intended to limit the invention 
to the particular form disclosed, but on the contrary, the 
intention is to cover all modifications, equivalents and 
alternatives falling within the spirit and scope of the 
present invention as defined by the appended claims. 

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0017] Turning now to Fig. 1, a block diagram of one
embodiment of a system 10 is shown. Other embodiments are
possible and contemplated. In the embodiment of Fig. 1, system
10 includes processors 12A-12B, an L2 cache 14, a memory
controller 16, a high speed input/output (I/O) bridge 18, an
I/O bridge 20, and I/O interfaces 22A-22B. System 10 may
include a bus 24 for interconnecting the various components of
system 10. As illustrated in Fig. 1, each of processors
12A-12B, L2 cache 14, memory controller 16, high speed I/O
bridge 18 and I/O bridge 20 are coupled to bus 24. I/O bridge
20 is coupled to I/O interfaces 22A-22B. L2 cache 14 is
coupled to memory controller 16, which is further coupled to a
memory 26.

[0018] Bus 24 may be a split transaction bus in the
illustrated embodiment. A split transaction bus splits the
address and data portions of each transaction and allows the
address portion (referred to as the address phase) and the
data portion (referred to as the data phase) to proceed
independently. In the illustrated embodiment, the address bus
and data bus are independently arbitrated for, allowing for
out of order data phases with respect to the corresponding
address phases. Each transaction including both address and
data thus includes an arbitration for the address bus, an
address phase, an arbitration for the data bus, and a data
phase. Additionally, coherent transactions may include a
response phase for communicating coherency information after
the address phase.

[0019] Various signals included in bus 24 are illustrated in
Fig. 1, including arbitration signals, address phase signals,
response phase signals, and data phase signals. The
arbitration signals include a set of address request signals
(A_Req[7:0]) used by each requesting agent to arbitrate for
the address bus and a set of data request signals (D_Req[7:0])
used by each responding agent to arbitrate for the data bus.
The address phase signals include an address bus used to
provide the address of the transaction (Addr[39:5]), a command
(A_CMD[2:0]) used to indicate the transaction to be performed
(read, write, etc.), a transaction ID (A_ID[9:0]) used to
identify the transaction, and a set of cache attributes
(A_L1CA[1:0]). More particularly, the transaction ID may be
used for read and write transactions to match the address
phase with the subsequent data phase of the transaction. A
portion of the transaction ID is an agent identifier
identifying the requesting agent. For example, the agent
identifier may be bits 9:6 of the transaction ID. Each agent
is assigned a different agent identifier. The cache attributes
may include a cacheability indicator indicating whether or not
the transaction is cacheable within the initiating agent and a
coherency indicator indicating whether or not the transaction
is to be performed coherently. The response phase signals
include a set of shared signals (R_SHD[5:0]) and a set of
exclusive signals (R_EXC[5:0]). Each agent which participates
in coherency may be assigned a corresponding one of the set of
shared signals and a corresponding one of the set of exclusive
signals. The data phase signals include a data bus
(Data[255:0]), a transaction ID (D_ID[9:0]) similar to the
transaction ID of the address phase and used to match the
address phase with the corresponding data phase, a responder
ID (D_RSP[3:0]), and a modified signal (D_Mod). The responder
ID is the agent identifier of the responding agent who
arbitrated for the data bus to perform the data transfer.
Additionally, bus 24 includes a clock signal (CLK) which
carries a clock to which the bus signals are referenced. Both
the address phase and the data phase may include other
signals, as desired, such as the L2 cacheability of a
transaction in the address phase and data error signals in the
data phase.
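
The transaction ID layout described above can be illustrated with a minimal sketch. It assumes only what the text states (a 10-bit transaction ID whose bits 9:6 carry the agent identifier); the treatment of bits 5:0 as a per-agent sequence number and the function names are hypothetical, chosen purely for illustration.

```python
# Sketch only: field positions follow the example in the text
# (agent identifier = bits 9:6 of a 10-bit transaction ID).
# Treating bits 5:0 as a "sequence" field is an assumption.
TXN_ID_BITS = 10
AGENT_ID_SHIFT = 6
AGENT_ID_MASK = 0xF
SEQUENCE_MASK = 0x3F


def make_transaction_id(agent: int, sequence: int) -> int:
    """Pack a hypothetical (agent, sequence) pair into a 10-bit ID."""
    if not 0 <= agent <= AGENT_ID_MASK:
        raise ValueError("agent identifier must fit in 4 bits")
    if not 0 <= sequence <= SEQUENCE_MASK:
        raise ValueError("sequence must fit in 6 bits")
    return (agent << AGENT_ID_SHIFT) | sequence


def agent_id(transaction_id: int) -> int:
    """Extract bits 9:6, the agent identifier of the requesting agent."""
    if not 0 <= transaction_id < (1 << TXN_ID_BITS):
        raise ValueError("transaction ID must fit in 10 bits")
    return (transaction_id >> AGENT_ID_SHIFT) & AGENT_ID_MASK
```

A responding agent could use the same extraction on D_ID[9:0] to determine which requester a data phase belongs to.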

[0020] Generally, if an agent initiates a coherent
transaction, each agent which participates in coherency (a
"snooping agent") responds to the transaction in the response
phase. Each snooping agent is assigned a shared signal and an
exclusive signal, and drives an indication of the state of the
data affected by the transaction on its assigned signals. For
example, in one embodiment, processors 12A-12B may be capable
of caching data in L1 data caches therein. Additionally, I/O
bridges 18 and 20 may be capable of caching data (e.g. caching
a cache line into which DMA write data is to be merged upon
receipt from an I/O device). Thus, each of processors 12A-12B
and I/O bridges 18 and 20 are assigned separate shared and
exclusive signals. It is noted that, while L2 cache 14 is
capable of caching data, L2 cache 14 may be a low latency
cache for memory 26 (as opposed to a cache dedicated to
another agent). Thus, L2 cache 14 may be a part of the memory
system along with memory controller 16 and memory 26. If data
is stored in L2 cache 14, L2 cache 14 responds to the
transaction instead of memory controller 16 and thus there is
no coherency issue between L2 cache 14 and memory 26 for this
embodiment.
[0021] Each snooping agent determines a state of the
data affected by the transaction. In one embodiment, for 
example, the MESI states are employed. The modified 
state indicates that no other snooping agent has a copy 
of the data and that the data is modified with respect to 
the copy in the memory system (L2 cache 14 and/or 
memory 26). The exclusive state indicates that no other 
snooping agent has a copy of the data and that the data 
is not modified with respect to the copy in the memory 
system. The shared state indicates that one or more oth- 
er snooping agents may have a copy of the data. The 
invalid state indicates that the snooping agent does not 
have a copy of the data. Other sets of states are possible 
and contemplated, including the MOESI states (which 
include the MESI states as well as an owned state in 
which the data may be shared with one or more other 
agents but may be modified with respect to the copy in 
the memory system and thus may be copied back to the 
memory system when the owning agent evicts the data) 
or any other set of states. Other embodiments may em- 
ploy any suitable subset of the MESI or MOESI states 
(e.g. ESI, MSI, MOSI, etc.). It is noted that the granular- 
ity on which snooping is performed may vary from em- 
bodiment to embodiment. Some embodiments may per- 
form snooping on a cache line granularity, while other 
embodiments may perform snooping on a partial cache 
line (e.g. sector) granularity, or a multiple cache line 
granularity. 

[0022] For an embodiment employing the MESI 
states, an agent signals the invalid state by deasserting 
both the shared and exclusive signals. The agent sig- 
nals the shared state by asserting the shared signal and 
deasserting the exclusive signal. The agent signals the 
exclusive state by asserting the exclusive signal and de- 
asserting the shared signal. If the agent has the data in 
the modified state, the agent also asserts the exclusive 
signal and deasserts the shared signal. Thus, for snoop- 
ing purposes, exclusive and modified may be treated as 
the same state. For an embodiment employing the 
MOESI states, the owned state may be signalled as exclusive
as well.
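
The response-phase encoding just described can be summarized in a short sketch. The enum and function names are illustrative assumptions; only the mapping from state to the (shared, exclusive) signal pair comes from the text.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    MODIFIED = "M"
    OWNED = "O"      # only present in MOESI embodiments
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def snoop_response(state: State) -> Tuple[bool, bool]:
    """Return (shared_asserted, exclusive_asserted) for one snooping agent."""
    if state is State.INVALID:
        return (False, False)   # both signals deasserted
    if state is State.SHARED:
        return (True, False)    # shared asserted, exclusive deasserted
    # Exclusive, modified, and (for MOESI) owned are all reported
    # as exclusive in the response phase.
    return (False, True)
```

Because modified and exclusive produce the same pair, the snoop logic in this sketch never needs to distinguish them.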

[0023] Since the agent signals the exclusive and mod- 
ified states in the same fashion, and since the modified 
state includes having the data exclusively, the agent 
need not determine the exact state of the data between 
exclusive and modified. Instead, it may be sufficient for 
the agent to determine if it is caching the data and
whether it is shared or exclusive. Thus, snooping may 
be simplified. 

[0024] An agent having the exclusive or modified 
state for the data affected by a transaction (and which thus
responded in the response phase by asserting the exclusive
signal) may be responsible for providing the data for that
transaction (if the transaction is a read). Thus, L2 cache 14
and memory controller 16 may receive the
exclusive signals, and may not provide the data for the 
read transaction if an exclusive signal is asserted. The 
agent responding exclusive may retrieve the data from 
its cache or other storage, arbitrate for the data bus, and 
transmit the data as the data phase of the transaction. 
Additionally, concurrent with the transmission of the da- 
ta, the agent may indicate whether the state is exclusive 
or modified using the D_Mod signal. More particularly, 
the D_Mod signal may be asserted to indicate the mod- 
ified state and deasserted to indicate the exclusive 
state. Thus, the correct state of the data (for an exclusive 
response) is communicated to the system during the da- 
ta phase. 
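
As a rough behavioral sketch of the deferred exclusive/modified determination, the toy model below answers the snoop with the exclusive indication only and reports the modified bit later, together with the data. The class and method names are hypothetical, and the model ignores bus arbitration and timing.

```python
from typing import Tuple


class SnoopingAgent:
    """Toy model of an agent holding one cache line exclusively."""

    def __init__(self, data: bytes, modified: bool = False) -> None:
        self.data = data
        self.modified = modified

    def snoop(self) -> Tuple[bool, bool]:
        # Response phase: (shared, exclusive). The agent does not
        # check whether the line is exclusive or modified here.
        return (False, True)

    def store(self, new_data: bytes) -> None:
        # An in-flight store may still update the line after the snoop.
        self.data = new_data
        self.modified = True

    def data_phase(self) -> Tuple[bytes, bool]:
        # Data phase: the data plus the D_Mod indication
        # (True = modified, False = exclusive).
        return (self.data, self.modified)
```

In this sketch a store that lands between `snoop()` and `data_phase()` is still reported correctly, because the modified indication is sampled only when the data is supplied.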

[0025] In one embodiment, bus 24 supports two types 
of read transactions: read non-exclusive and read ex- 
clusive. The term "read transactions" is used to generi- 
cally mean any read, and read non-exclusive and read 
exclusive are used for each type of read. A read non- 
exclusive is a read transaction performed by an agent 
which can accept the data as either shared or exclusive, 
based on the response phase of the transaction. A read 
exclusive transaction is a read transaction which is de- 
fined to result in the requesting agent caching the data 
in an exclusive state. 

[0026] Each requesting agent receives the exclusive 
and shared signals from each snooping agent. Thus, the 
requesting agent for a transaction may determine an ap- 
propriate state for the data received in response to the 
transaction, and may cache the data in that state. For 
example, if the transaction is a read non-exclusive and 
either a shared or exclusive signal is asserted, the data 
may be cached by the requesting agent in the shared 
state. If the transaction is a read non-exclusive and nei- 
ther a shared nor an exclusive signal is asserted, the 
data may be cached by the requesting agent in the ex- 
clusive state. If the transaction is a read exclusive trans- 
action, the data may be cached in the exclusive state 
regardless of the signals. However, the exclusive sig- 
nals may still be used by the memory system to inhibit 
providing data for the transaction if exclusive is sig- 
nalled. 
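
The requesting agent's choice of state can be written out as a small decision function. This is a sketch of the rules in the preceding paragraph; the function names and the use of boolean sequences for the sampled R_SHD and R_EXC signals are assumptions made for illustration.

```python
from typing import Sequence


def requester_state(read_exclusive: bool,
                    r_shd: Sequence[bool],
                    r_exc: Sequence[bool]) -> str:
    """State in which the requesting agent may cache the returned data."""
    if read_exclusive:
        # Read exclusive: cached exclusive regardless of the signals.
        return "exclusive"
    if any(r_shd) or any(r_exc):
        # Read non-exclusive and some snooper has a copy.
        return "shared"
    # Read non-exclusive and no snooper reported a copy.
    return "exclusive"


def memory_system_inhibited(r_exc: Sequence[bool]) -> bool:
    """The memory system withholds data if any agent signalled exclusive."""
    return any(r_exc)
```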

[0027] In the illustrated embodiment, system 10 employs a
distributed arbitration scheme, and thus each requesting agent
is assigned an address request signal (one of A_Req[7:0]), and
each responding agent is assigned a data request signal
(D_Req[7:0]). More particularly, as mentioned above, each
agent is assigned an agent identifier and the corresponding
address request signal and/or data request signal may be used
by that agent.

[0028] The fairness scheme implemented by one embodiment of
system 10 may be one in which the agent granted the bus is
made lowest priority for being granted the bus again. The
highest priority agent which is requesting the bus is granted
the bus. Since address and data buses are separately
arbitrated, separate priority states are maintained for the
address and data buses.
[0029] Each agent may include an address arbiter coupled to
receive at least the address request signals (A_Req[7:0])
corresponding to each other requesting agent besides the
requesting agent to which that address arbiter corresponds
(the "corresponding agent"). The address arbiter tracks which
of the agents are higher priority than the corresponding agent
and which agents are lower priority than the corresponding
agent for address bus arbitration. Thus, given the request
signals from each other agent, the address arbiter can
determine whether or not the corresponding agent wins the
arbitration for the address bus. The address arbiter uses the
agent identifier (A_ID[9:6]) in the address phase of the
transaction performed by the arbitration winner to update the
priority state for the corresponding agent. More particularly,
the agent which won the arbitration is marked as lower
priority than the corresponding agent. On the other hand, if
the corresponding agent does win the arbitration, the address
arbiter updates the priority state to indicate that each other
agent is higher priority than the corresponding agent. The
data arbiter in each responding agent may operate similarly
with respect to the data request signals (D_Req[7:0]) and the
agent identifier (D_RSP[3:0]) in the data phase of a
transaction.
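
A minimal sketch of the distributed arbitration described above follows. Each arbiter keeps only the set of agents that currently outrank its own agent. The reset ordering (lower-numbered agents start with higher priority) and all names are assumptions; the text does not specify an initial priority state.

```python
from typing import List, Optional, Set


class AddressArbiter:
    """Per-agent arbiter tracking which other agents are higher priority."""

    def __init__(self, agent: int, num_agents: int) -> None:
        self.agent = agent
        self.others = set(range(num_agents)) - {agent}
        # Assumed reset state: lower-numbered agents start higher priority.
        self.higher = {a for a in self.others if a < agent}

    def wins(self, requests: Set[int]) -> bool:
        # The corresponding agent wins if it is requesting and no
        # higher-priority agent is also requesting.
        return self.agent in requests and not (self.higher & requests)

    def update(self, winner: int) -> None:
        if winner == self.agent:
            # Winner becomes lowest priority: everyone else outranks it.
            self.higher = set(self.others)
        else:
            # The winner (seen via the agent identifier in the address
            # phase) is now lower priority than the corresponding agent.
            self.higher.discard(winner)


def arbitrate(arbiters: List[AddressArbiter],
              requests: Set[int]) -> Optional[int]:
    """Run one arbitration round across all per-agent arbiters."""
    winners = [a.agent for a in arbiters if a.wins(requests)]
    if not winners:
        return None
    winner = winners[0]
    for a in arbiters:
        a.update(winner)
    return winner
```

With every agent requesting continuously, this sketch grants the bus in rotation, which matches the stated fairness goal that a granted agent becomes lowest priority.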

[0030] Bus 24 may be pipelined. Bus 24 may employ 
any suitable signalling technique. For example, in one 
embodiment, differential signalling may be used for high 
speed signal transmission. Other embodiments may 
employ any other signalling technique (e.g. TTL, CMOS,
GTL, HSTL, etc.). 

[0031] Processors 12A-12B may be designed to any instruction
set architecture, and may execute programs written to that
instruction set architecture. Exemplary instruction set
architectures may include the MIPS instruction set
architecture (including the MIPS-3D and MIPS MDMX application
specific extensions), the IA-32 or IA-64 instruction set
architectures developed by Intel Corp., the PowerPC
instruction set architecture, the Alpha instruction set
architecture, the ARM instruction set architecture, or any
other instruction set architecture.
[0032] L2 cache 14 is a high speed cache memory. L2 cache 14
is referred to as "L2" since processors 12A-12B may employ
internal level 1 ("L1") caches. If L1 caches are not included
in processors 12A-12B, L2 cache 14 may be an L1 cache.
Furthermore, if multiple levels of caching are included in
processors 12A-12B, L2 cache 14 may be a lower level cache
than L2. L2 cache 14 may employ any organization, including
direct mapped, set associative, and fully associative
organizations. In one particular implementation, L2 cache 14
may be a 512 kilobyte, 4 way set associative cache having 32
byte cache lines. A set associative cache is a cache arranged
into multiple sets, each set comprising two or more entries. A
portion of the address (the "index") is used to select one of
the sets (i.e. each encoding of the index selects a different
set). The entries in the selected set are eligible to store
the cache line accessed by the address. Each of the entries
within the set is referred to as a "way" of the set. The
portion of the address remaining after removing the index (and
the offset within the cache line) is referred to as the "tag",
and is stored in each entry to identify the cache line in that
entry. The stored tags are compared to the corresponding tag
portion of the address of a memory transaction to determine if
the memory transaction hits or misses in the cache, and the
result of the comparison is used to select the way in which
the hit is detected (if a hit is detected).
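
For the particular implementation mentioned (512 kilobytes, 4 ways, 32 byte lines), the address split works out to 4096 sets, a 5-bit offset, and a 12-bit index. The sketch below shows that arithmetic and a tag comparison across the ways of one set. The function names are hypothetical, and valid bits are omitted for brevity.

```python
from typing import Optional, Sequence, Tuple

CACHE_BYTES = 512 * 1024
WAYS = 4
LINE_BYTES = 32

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 4096 sets
OFFSET_BITS = LINE_BYTES.bit_length() - 1       # 5 bits
INDEX_BITS = NUM_SETS.bit_length() - 1          # 12 bits


def split_address(addr: int) -> Tuple[int, int, int]:
    """Split an address into (tag, index, offset)."""
    offset = addr & (LINE_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def lookup(tags_in_set: Sequence[int], tag: int) -> Optional[int]:
    """Compare the tag against each way; return the hitting way or None."""
    for way, stored_tag in enumerate(tags_in_set):
        if stored_tag == tag:
            return way
    return None
```

With the 40-bit address implied by Addr[39:5] plus a 5-bit line offset, the tag in this arrangement would be 23 bits wide.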

[0033] Memory controller 16 is configured to access memory 26
in response to memory transactions received on bus 24. Memory
controller 16 receives a hit signal from L2 cache 14, and if a
hit is detected in L2 cache 14 for a memory transaction,
memory controller 16 does not respond to that memory
transaction. If a miss is detected by L2 cache 14, or the
memory transaction is non-cacheable, memory controller 16 may
access memory 26 to perform the read or write operation.
Memory controller 16 may be designed to access any of a
variety of types of memory. For example, memory controller 16
may be designed for synchronous dynamic random access memory
(SDRAM), and more particularly double data rate (DDR) SDRAM.
Alternatively, memory controller 16 may be designed for DRAM,
Rambus DRAM (RDRAM), SRAM, or any other suitable memory
device.

[0034] High speed I/O bridge 18 may be an interface
to a high speed I/O interconnect. For example, high 
speed I/O bridge 18 may implement the Lightning Data 
Transport (LDT) I/O fabric developed by Advanced Mi- 
cro Devices, Inc. Other high speed interfaces may be 
alternatively used. 

[0035] I/O bridge 20 is used to link one or more I/O 
interfaces (e.g. I/O interfaces 22A-22B) to bus 24. I/O 
bridge 20 may serve to reduce the electrical loading on 
bus 24 if more than one I/O interface 22A-22B is bridged 
by I/O bridge 20. Generally, I/O bridge 20 performs 
transactions on bus 24 on behalf of I/O interfaces 22A- 
22B and relays transactions targeted at an I/O interface 
22A-22B from bus 24 to that I/O interface 22A-22B. I/O 
interfaces 22A-22B may be lower bandwidth, higher latency
interfaces. For example, I/O interfaces 22A-22B may include
one or more serial interfaces, Personal Computer Memory Card
International Association (PCMCIA) interfaces, Ethernet
interfaces (e.g. media access control level interfaces),
Peripheral Component Interconnect (PCI) interfaces, etc.

[0036] It is noted that system 10 (and more particularly
processors 12A-12B, L2 cache 14, memory controller 16, I/O
interfaces 22A-22B, I/O bridge 20, I/O bridge 18 and bus 24)
may be integrated onto a single integrated circuit as a system
on a chip configuration. In another configuration, memory 26
may be integrated as well. Alternatively, one or more of the
components may be implemented as separate integrated circuits,
or all components may be separate integrated circuits, as
desired. Any level of integration may be used.

[0037] It is noted that, while the illustrated embodiment
employs a split transaction bus with separate arbitration for
the address and data buses, other embodiments may employ
non-split transaction buses arbitrated with a single
arbitration for address and data and/or a split transaction
bus in which the data bus is not explicitly arbitrated.
Additionally, other embodiments may use a central arbitration
scheme instead of a distributed arbitration scheme.

[0038] It is noted that, while various bit ranges for signals
are illustrated in Fig. 1 and other figures below, the bit
ranges may be varied in other embodiments. The number of
request signals, the size of the agent identifier and
transaction ID, the size of the address bus, the size of the
data bus, etc., may all be varied according to design choice.

[0039] It is noted that, while the illustrated embodiment
includes a signal indicating whether transactions are coherent
or not, other embodiments may treat all transactions as
coherent. Additionally, while the present embodiment provides
for separate shared and exclusive signals for each agent
capable of caching data, other embodiments may employ a single
shared signal and a single exclusive signal. Each agent
capable of caching data may be coupled to the shared and
exclusive signal, and may assert the signal as needed to
indicate the state of the affected data. Furthermore, other
embodiments may use different signal encodings than a shared
and exclusive signal.

[0040] It is noted that, while the memory system (L2 cache 14
and memory controller 16) is described as updating data which
is indicated as modified using the D_Mod signal during the
data phase, other embodiments may not have the memory system
update the data. Instead, the requesting agent could cache the
data in the modified state, if desired.

[0041] Turning next to Fig. 2, a timing diagram is shown
illustrating an exemplary read transaction according to one
embodiment of bus 24. Other embodiments are possible and
contemplated. In Fig. 2, clock cycles are delimited by
vertical dashed lines. Each clock cycle is labeled at the top
(0, 1, 2, 3, and N). The clock cycles illustrated in Fig. 2
are periods of the CLK clock signal which clocks bus 24.

[0042] The phases of the exemplary transaction are illustrated
in Fig. 2. During clock cycle 0, the requesting agent for the
transaction participates in an arbitration and wins the
arbitration. During clock cycle 1, the requesting agent
transmits the address phase of the transaction, including the
address, command, etc. shown in Fig. 1. The transaction may be
indicated to be a coherent transaction on the A_L1CA[1:0]
signals. During clock cycle 2, no phases of the transaction
occur. During clock cycle 3, the response phase of the
transaction occurs. For the exemplary transaction, the
snooping agent assigned R_SHD[0] and R_EXC[0] detects an
exclusive state for the cache line affected by the
transaction, and thus deasserts the R_SHD[0] signal and
asserts the R_EXC[0] signal. Since the snooping agent asserted
the exclusive signal, the snooping agent provides the data in
the data phase (clock cycle N). Additionally, the snooping
agent concurrently indicates whether the data is exclusive or
modified during clock cycle N. In the example, the snooping
agent asserts the D_Mod signal, indicating that the data is
modified. If the data were exclusive, the snooping agent would
deassert the D_Mod signal during clock cycle N.

[0043] Transactions in which the snooping agent does 
not detect exclusive may be similar, except that the ex- 
clusive signal may be deasserted in clock cycle 3. The 
shared signal may be asserted if the data is in the shared 
state, or deasserted if the data is in the invalid state. 
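
The signal encoding described for the response phase and the data
phase can be sketched as follows. This is a minimal illustration, not
the circuit itself: the names State, response_phase, and
data_phase_d_mod are hypothetical, and True is assumed to mean
"asserted".

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


def response_phase(state):
    """Return (R_SHD, R_EXC) for a snooping agent; True means asserted."""
    # Exclusive and modified both produce the exclusive response.
    if state in (State.EXCLUSIVE, State.MODIFIED):
        return (False, True)
    if state is State.SHARED:
        return (True, False)
    # Invalid: neither signal is asserted.
    return (False, False)


def data_phase_d_mod(state):
    """Return D_Mod for an agent that responded exclusive and supplies data."""
    if state not in (State.EXCLUSIVE, State.MODIFIED):
        raise ValueError("only an agent that responded exclusive supplies data")
    return state is State.MODIFIED
```

In this sketch a modified line and an exclusive line are
indistinguishable in the response phase; they differ only in the
D_Mod value sent with the data.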

[0044] As Fig. 2 illustrates, the response phase may
occur relatively quickly, but the data phase may be delayed
by some number of clock cycles (illustrated by the
ellipsis between clock cycle 3 and clock cycle N). In the
embodiment of Fig. 1, in which the data bus is independently
arbitrated for by the responding agent (the snooping
agent, in this case), the snooping agent controls
when the data is supplied. Thus, the snooping agent
may provide an indication of exclusive, shared, or invalid
quickly but defer indicating if the exclusive indication is
either the exclusive state or the modified state. Therefore,
the snooping agent may defer determining if the
data is exclusive or modified. This may allow for flexibility
in the snooping agent. For example, in processor 12A
or 12B, the data may actually be exclusive at the time
of snooping, but an in-flight store may modify the data
before the data is fetched from the data cache to be provided
in the data phase of the transaction. The in-flight
store may be allowed to complete in this case, since the
data is in the exclusive state. The subsequent fetching
of the data from the data cache then fetches the data in
the modified state, and indicates modified during the data
phase. Thus, the operation to fetch the data from the
data cache (and change the data cache's state) may be
performed in a less intrusive way. Identifying the exclusive
or modified state in the response phase might be more
complex to implement.
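
The deferred determination can be illustrated with a small model. The
sketch below is an assumption-laden simplification: CacheLine and
snoop_read are hypothetical names, the snooped transaction is assumed
to be a read non-exclusive, and in-flight stores are modeled as a list
of values written before the fetch.

```python
class CacheLine:
    def __init__(self, state="E", data=0):
        self.state = state  # one of "M", "E", "S", "I"
        self.data = data


def snoop_read(line, in_flight_stores):
    """Model a read snoop; returns (response, data, d_mod)."""
    # Response phase: exclusive covers both E and M, so it can be given at once.
    if line.state in ("E", "M"):
        response = "exclusive"
    elif line.state == "S":
        response = "shared"
    else:
        response = "invalid"
    if response != "exclusive":
        return response, None, None
    # Stores already in the pipeline ahead of the fetch are allowed to complete.
    for value in in_flight_stores:
        line.data = value
        line.state = "M"
    # Data phase: the state is read at fetch time and reported on D_Mod.
    data, d_mod = line.data, line.state == "M"
    # Assumed read non-exclusive: the snooped line is left shared.
    line.state = "S"
    return response, data, d_mod
```

With one in-flight store, a line that was exclusive at snoop time is
reported as modified in the data phase, and the response given earlier
remains valid.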

[0045] In one embodiment, agents driving a signal
during a clock cycle drive the signal responsive to the
rising edge of the clock signal in that clock cycle. Agents
receiving the signal sample the signal on the falling edge
of the clock signal. Accordingly, a snooping agent in this
embodiment samples the address on the falling edge of
the CLK clock signal in clock cycle 1 and drives response
signals in clock cycle 3. In other words, the
snooping agent has 1 1/2 clock cycles of the CLK clock
to determine the snoop response. Other embodiments
may specify different delays from the address phase
to the response phase, including longer and shorter delays
than those shown.
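
The 1 1/2 cycle figure follows from the edge timing above. A small
worked calculation, assuming the falling edge occurs halfway through
each clock cycle and measuring time in CLK periods (snoop_window is a
hypothetical helper name):

```python
from fractions import Fraction


def snoop_window(address_cycle=1, response_cycle=3):
    """Clock cycles between sampling the address and driving the response."""
    # The address is sampled on the falling edge, half a cycle into its cycle.
    sample_time = Fraction(address_cycle) + Fraction(1, 2)
    # Response signals are driven on the rising edge that starts their cycle.
    drive_time = Fraction(response_cycle)
    return drive_time - sample_time
```

For the illustrated transaction, snoop_window() evaluates to
Fraction(3, 2), that is, 1 1/2 clock cycles.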

[0046] In one embodiment, bus 24 is pipelined. Thus,
a second agent may win arbitration in clock cycle 1 to
perform a second transaction, present an address
phase of the second transaction in clock cycle 2, and
have a response phase of the second transaction in a
clock cycle succeeding clock cycle 3. Similarly, a third
agent may win arbitration in clock cycle 2 to perform a
third transaction, etc. In one embodiment, to simplify the
coherency protocol, agents initiating a transaction are
prohibited from initiating a transaction to the same
cache line as a currently outstanding transaction which
has not reached its response phase. Thus, for example,
the second transaction and third transaction referred to
above may not be to the cache line affected by the illustrated
transaction. The more rapidly the response is provided,
the more rapidly the next transaction affecting
that cache line may be initiated. Other embodiments
may allow initiation of transactions to the same cache
line prior to the response phase of a transaction. The
requesting agent of the first transaction may receive its
response phase and determine the response for the
next requesting agent from the response. Even in such
an embodiment, it may be desirable for the response
phase to be rapid to minimize complexity and the latency
of each transaction.
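
The same-cache-line restriction can be sketched as a simple tracker.
This is an illustrative model only: OutstandingTracker and its methods
are hypothetical names, and the 32-byte cache line size is an assumed
value that the description does not specify.

```python
class OutstandingTracker:
    """Tracks cache lines with a transaction awaiting its response phase."""

    def __init__(self, line_bytes=32):  # line size is an assumption
        self.line_bytes = line_bytes
        self.pending = set()  # cache-line indices awaiting a response phase

    def _line(self, address):
        return address // self.line_bytes

    def may_initiate(self, address):
        # A new transaction is allowed only if its line has nothing pending.
        return self._line(address) not in self.pending

    def address_phase(self, address):
        if not self.may_initiate(address):
            raise RuntimeError("cache line has an outstanding transaction")
        self.pending.add(self._line(address))

    def response_phase(self, address):
        # Once the response phase occurs, the line may be targeted again.
        self.pending.discard(self._line(address))
```

A transaction to a different cache line may be pipelined immediately,
while one to the same line waits for the earlier response phase.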

[0047] It is noted that the present discussion refers to
the assertion and deassertion of various signals. The
assertion of a signal transmits a first piece of information
(e.g. shared for the R_SHD[5:0] signals, exclusive for
the R_EXC[5:0] signals, or modified for the D_Mod signal).
The deassertion of the signal does not transmit the
first piece of information. The deassertion may transmit
a second piece of information (e.g. exclusive for the
D_Mod signal). A signal may be defined to be asserted
in either the high state or the low state, according to design
choice, and the signal may be deasserted in the
other state. Additionally, the signals may be differential
and either a positive or a negative difference may be
defined to be asserted and the other difference to be
deasserted. Furthermore, while a modified signal
(D_Mod) is defined in the illustrated embodiment, an exclusive
signal (D_Exc) could also be used, asserted if
the data is exclusive and deasserted if the data is modified.

[0048] Turning now to Fig. 3, a block diagram of one
embodiment of processor 12A is shown. Other embodiments
are possible and contemplated. Processor 12B
may be similar. In the embodiment of Fig. 3, processor
12A includes a processor core 40, a data cache 42, a
bus interface unit (BIU) 44, snoop tags 46, and a snoop
queue 48. Processor core 40 is coupled to BIU 44 and
to data cache 42, which is further coupled to BIU 44.
BIU 44 is further coupled to snoop tags 46, snoop queue
48, and bus 24.

[0049] Generally, BIU 44 comprises circuitry for interfacing
processor 12A to bus 24, including circuitry for
handling the coherency aspects of bus 24. More particularly
for the illustrated embodiment, BIU 44 may capture
transaction information corresponding to coherent
transactions into snoop queue 48. For example, the address
and the transaction identifier may be captured as
illustrated in Fig. 3. Additional information may be captured
as well, such as the type of transaction. BIU 44
may access snoop tags 46 to provide a snoop response
during the response phase of each transaction in snoop
queue 48, and may provide a snoop operation to processor
core 40 for insertion into the pipeline or pipelines
which access data cache 42. The snoop operation may
be used to change state in data cache 42 and/or to fetch
data from data cache 42 for transmission on bus 24.

[0050] Data cache 42 may be a high speed storage
for storing cache lines and tag information including the
address of the cache line and a state of the cache line.
Snoop tags 46 may be a storage for storing tag information
corresponding to data cached in data cache 42, including
the addresses corresponding to each cache line
in data cache 42 and a state of the cache line. However,
snoop tags 46 may not track the entire state used by
data cache 42 (e.g. the MESI state). In one embodiment,
for example, snoop tags 46 may track the exclusive (E),
shared (S), and invalid (I) states but not the modified (M)
state. Transitions between the E, S, and I states generally
involve a transaction on bus 24 while transitions
from the E state to the M state may be performed without
a bus transaction. Since snoop tags 46 does not track
the M state (using its E state to represent both the M
state and the E state of data cache 42), snoop tags 46
may be operated at the bus frequency instead of the
processor core frequency. For example, in one embodiment
the processor core 40 and data cache 42 operate
at twice the frequency of bus 24. Other embodiments
may use even higher multiples. Therefore, transitions
from exclusive to modified (performed in response to a
store memory operation by processor core 40 to a cache
line in the exclusive state within data cache 42) may occur
at two or more times within each bus clock cycle.
Thus, tracking the modified state while operating according
to the bus clock cycle may be more complex
than other states. Tracking the states between which
transitions occur in response to bus transactions may
simplify the design of snoop tags 46.

[0051] While the snoop tags 46 may not exactly track
the state of data cache 42 (referred to as being loosely
coupled to data cache 42), snoop tags 46 provides
enough information for BIU 44 to determine a response
for the response phase of the transaction. Thus, processor
core 40 and data cache 42 may continue operation
unimpeded by snooping unless a snoop hit occurs.
In many types of applications, snoop hits are relatively
rare. Thus, the interruption of processor core 40 and data
cache 42 for coherency purposes may be infrequent.
The interruption may occur when a state change is to
be performed due to a snoop hit or to fetch data to be
provided in response to a snoop hit. However, the act of
snooping may be relatively frequent, and thus using
snoop tags 46 may prevent the interruption of data
cache 42 and/or processor core 40 to snoop when no
snoop hit is going to be detected. A snoop hit is detected
if the address of a transaction for which the snoop is
performed is a cache hit in the cache (or other storage,
in the case of I/O bridges 18 and 20) of the snooping
agent.

[0052] If BIU 44 detects an exclusive state for a cache
line affected by a particular read transaction, BIU 44
may provide a snoop operation to processor core 40 to
fetch the data from the cache line in data cache 42. Processor
core 40 may insert the snoop transaction at a convenient
point in the pipeline which accesses data cache
42. An example is shown in Fig. 6 below. BIU 44 may
receive the data from data cache 42 as well as the exclusive
or modified state of the data, and may arbitrate
for the data bus portion of bus 24. Upon winning the arbitration,
BIU 44 may drive the data from data cache 42
as the data for the transaction, and may indicate the exclusive
or modified state of the data on the D_Mod signal.
The transaction ID for the corresponding transaction
in snoop queue 48 may be used as the transaction
ID (D_ID[9:0]) for the data phase. The snoop operation
which fetches the data may also cause a state change
for the data in data cache 42, and the state change may
be reflected in snoop tags 46 as well.

[0053] Since the snoop operation to fetch the data is
inserted at a convenient point in the pipeline, it is possible
that stores already in-flight in that pipeline may update
the data prior to fetching the data from data cache
42. However, since the response phase indication of exclusive
includes the modified state as well, it may be permissible
for these stores to be performed prior to fetching
the data and providing the data to BIU 44. Coherency
of the cache line may still be maintained in this case.
[0054] While the present embodiment employs snoop 
tags 46 for performing snooping, other embodiments 
may not use snoop tags 46. Instead, data cache 42 may 
include circuitry for performing a snoop. In such an em- 
bodiment, in-flight stores may still be allowed to update 
an exclusive line after the snoop has taken place, and 
the exclusive or modified state may be determined when 
the data is fetched from data cache 42 for transmission 
on bus 24. In other embodiments, the snoop tags 46
may track the same set of states as data cache 42
(e.g. the MESI states). In such an embodiment, agents
may provide a modified indication in the response
phase. However, such an embodiment may still allow
in-flight stores to update an exclusive line after the
snoop has taken place, and the exclusive or modified
state may be determined when the data is fetched from
data cache 42 for transmission on bus 24 (e.g. on the
D_Mod signal) concurrent with the data transfer.
[0055] If BIU 44 detects a shared state for a cache 
line affected by a particular transaction and that trans- 
action indicates an invalidation of the cache line (e.g. a 
write, a read exclusive, or an invalidate command), BIU 
44 may also transmit a snoop operation to processor 
core 40 for insertion into a pipeline which accesses data 
cache 42. The operation changes the state in data 
cache 42 but may not fetch data for transmission on bus 
24. Similar to the above case, in-flight stores may con- 
tinue progress. 

[0056] As mentioned above, snoop tags 46 stores tag 
information for each cache line in data cache 42. More 
particularly, snoop tags 46 may be a storage comprising 
a plurality of entries, each entry storing tag information 
corresponding to one cache line of data cache 42. The 
entries may be organized in the same fashion as data 
cache 42 (e.g. set associative, direct mapped, fully as- 
sociative, etc.). 

[0057] It is noted that providing responses in the re- 
sponse phase for write transactions may be optional. 
Some embodiments may provide a snoop response for 
write transactions, and other embodiments may not
provide a snoop response for write transactions. How- 
ever, write transactions may be snooped to cause state 
updates, as illustrated in Figs. 4 and 5 below for the ESI 
and MESI states. 

[0058] Turning next to Fig. 4, a state diagram illustrat- 
ing the ESI states which may be tracked by one embod- 
iment of snoop tags 46 under control of BIU 44 is shown. 
Other embodiments are possible and contemplated. In 
the embodiment of Fig. 4, the invalid (I) state 50, the 
shared (S) state 52, and the exclusive (E) state 54 are 
illustrated. The transitions between each state are illus- 
trated as well. 

[0059] BIU 44 may change the state of a cache line 
from the invalid state 50 to the shared state 52 if proc- 
essor core 40 executes a load to the cache line (which 
misses data cache 42 since the cache line is invalid and 
results in BIU 44 performing a read non-exclusive trans- 
action to the cache line) and either the shared or the 
exclusive response is received from a snooping agent 
by BIU 44 during the response phase of the read non- 
exclusive transaction. BIU 44 may change the state of 
the cache line from the shared state 52 to the invalid 
state 50 responsive to an eviction of the cache line from 
data cache 42 in response to a line fill of another cache 
line or in response to a snoop hit causing an invalidation 
(e.g. a snoop hit due to a write transaction, an invalidate 
command, or a read exclusive transaction initiated by 
another agent). 

[0060] BIU 44 may change the state of the cache line
from the invalid state to the exclusive state responsive
to performing a read exclusive transaction on bus 24
(which results from, e.g., processor core 40 performing
a store miss to data cache 42) or responsive to performing
a read non-exclusive transaction (for, e.g., a load
miss by processor core 40) which receives no shared or
exclusive response in its response phase. BIU 44 may
change the state of the cache line from the exclusive
state 54 to the invalid state 50 similar to a transition from
the shared state 52 to the invalid state 50.

[0061] BIU 44 may change the state of the cache line
from the shared state 52 to the exclusive state 54 responsive
to successfully performing an invalidate transaction
on bus 24 in response to the processor attempting
to perform a store to the cache line. BIU 44 may
change the state of the cache line from the exclusive
state 54 to the shared state 52 responsive to a snoop
hit by a read non-exclusive transaction initiated by another
agent.

[0062] While the above description refers to BIU 44
changing the state of a cache line in snoop tags 46, it is
noted that snoop tags 46 may include the circuitry for
changing states.
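
The ESI transitions of Fig. 4 described above can be summarized as a
lookup table. This is a sketch for illustration: the event names and
the next_tag_state helper are hypothetical labels for the conditions
in the text, and events with no listed transition are assumed to leave
the state unchanged.

```python
# (current state, event) -> next state, for the snoop tags (Fig. 4).
ESI_TRANSITIONS = {
    # Load miss; another agent responded shared or exclusive.
    ("I", "read_nonexclusive_with_response"): "S",
    # Load miss; no shared or exclusive response was received.
    ("I", "read_nonexclusive_no_response"): "E",
    # Store miss; read exclusive transaction performed.
    ("I", "read_exclusive"): "E",
    # Store to a shared line; invalidate transaction succeeded.
    ("S", "invalidate_succeeded"): "E",
    ("S", "evict"): "I",
    ("S", "snoop_invalidating"): "I",
    ("E", "evict"): "I",
    ("E", "snoop_invalidating"): "I",
    # Snoop hit by another agent's read non-exclusive transaction.
    ("E", "snoop_read_nonexclusive"): "S",
}


def next_tag_state(state, event):
    # Events with no listed transition leave the tag state unchanged.
    return ESI_TRANSITIONS.get((state, event), state)
```

Every event in the table corresponds to a bus transaction or an
eviction, which is consistent with operating the snoop tags at the bus
frequency.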

[0063] Turning next to Fig. 5, a state diagram illustrating
the MESI states which may be tracked by one embodiment
of data cache 42 is shown. Other embodiments
are possible and contemplated. In the embodiment
of Fig. 5, the invalid (I) state 50, the shared (S)
state 52, the exclusive (E) state 56, and the modified
(M) state 58 are illustrated. The transitions between
each state are illustrated as well.

[0064] The invalid state 50 and the shared state 52
may have the same meaning as similarly shown states
in Fig. 4. However, the exclusive state 54 shown in Fig.
4 may represent both the exclusive state 56 and the
modified state 58 illustrated in Fig. 5. The transitions between
the invalid state 50 and the shared state 52 and
between the invalid state 50 and the exclusive state 56
may be the same as those shown in Fig. 4 and thus are
not described again with respect to Fig. 5. Additionally,
data cache 42 may transition a cache line from modified
state 58 to invalid state 50 in a manner similar to the
transition of shared state 52 or exclusive state 56 to
invalid state 50.

[0065] Data cache 42 may transition a cache line from
the shared state 52 to the exclusive state 56 responsive
to a successful invalidate transaction on bus 24 by BIU
44 in response to a store to the cache line. This transition
may be accompanied by a transition in snoop tags 46
to exclusive state 54 as illustrated in Fig. 4. Data cache
42 may subsequently transition a cache line from the
exclusive state 56 to modified state 58 responsive to the
store updating the cache line.

[0066] Data cache 42 may transition a cache line from
either the exclusive state 56 or the modified state 58 to
the shared state 52 responsive to a snoop hit for a read
non-exclusive transaction initiated by another agent.
This transition may be accompanied by a transition in
snoop tags 46 from exclusive state 54 to shared state
52 as illustrated in Fig. 4.

[0067] Data cache 42 may transition a cache line from
exclusive state 56 to modified state 58 in response to a
store to the cache line. Snoop tags 46 may not be modified
during this transition, thus remaining in the exclusive
state 54 as illustrated in Fig. 4.

[0068] It is noted that the transition from invalid state
50 to exclusive state 56 for a read exclusive by BIU 44
for a store miss to data cache 42 may instead be a direct
transition from invalid state 50 to modified state 58. Such
a transition may be performed, for example, if the store
data is merged into the line fill data as it is written to data
cache 42.
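
The relationship between the MESI states of the data cache (Fig. 5)
and the ESI states of the snoop tags (Fig. 4) can be sketched as
follows. The function and event names are hypothetical, and only a
subset of the transitions described above is modeled.

```python
def collapse(mesi_state):
    """State kept by the snoop tags for a given data cache state."""
    # The snoop tags use E to represent both E and M.
    return "E" if mesi_state == "M" else mesi_state


def data_cache_next(state, event):
    """Next MESI state in the data cache for a subset of Fig. 5 events."""
    if event == "store_hit" and state in ("E", "M"):
        return "M"  # performed without a bus transaction
    if event == "invalidate_succeeded" and state == "S":
        return "E"
    if event == "snoop_read_nonexclusive" and state in ("E", "M"):
        return "S"
    if event in ("evict", "snoop_invalidating") and state != "I":
        return "I"
    return state


def needs_tag_update(state, event):
    """True when the snoop tags must change along with the data cache."""
    return collapse(state) != collapse(data_cache_next(state, event))
```

In this model a store to an exclusive line changes the data cache
state but needs_tag_update returns False, matching the statement that
the snoop tags remain in the exclusive state during that transition.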

[0069] It is noted that transitions shown in Fig. 5 re- 
sulting from snoop operations may be performed, in one 
embodiment, in response to snoop operations inserted 
into a cache access pipeline responsive to a snoop hit. 
[0070] It is noted that the invalid state shown in Figs. 
4 and 5 may represent a tag which is stored in data 
cache 42 and snoop tags 46 in the invalid state or a tag 
which misses in data cache 42 and snoop tags 46. Thus, 
for example, a transition to the invalid state for a cache 
line in response to a line fill may physically be a replace- 
ment of the tag in data cache 42 and snoop tags 46 with 
a different tag and a different state. 

[0071] Turning next to Fig. 6, an exemplary pipeline
60 which may be part of one embodiment of processor
core 40 is shown. Other embodiments are possible and
contemplated. In the embodiment of Fig. 6, pipeline 60
includes a decode stage 62, an issue stage 64, an address
generation stage 66, a translation lookaside buffer
(TLB) stage 68, and a pair of cache access stages 70
and 72. A mux 74 is inserted between decode stage 62
and issue stage 64, and a snoop operation may be input
to mux 74 by BIU 44. BIU 44 may also provide a selection
control to mux 74.

[0072] Generally, memory access instructions such
as loads and stores may be decoded in decode stage 62
and may flow through stages 64-72 for execution. In issue
stage 64, the memory access instructions may be
selected for execution and issued to address generation
stage 66. In address generation stage 66, the operands
of the memory access instructions are added to generate
a virtual address of the data to be read or written.
The virtual address may be presented to a TLB in TLB
stage 68 for translation to a physical address, which may
be presented to the data cache 42 for access in stages
70 and 72. Thus, stages 70 and 72 may be coupled to
data cache 42.

[0073] If a snoop operation is initiated by BIU 44, the
snoop operation may be inserted into pipeline 60 at the
issue stage 64. BIU 44 may provide the operation to mux
74 and select the operation through mux 74 as a selection
control. The selection control may also act as a stall
signal for the decode stage 62, if an instruction is being
decoded, since the instruction may not pass through
mux 74 to the issue stage 64 if the select signal causes
the operation from BIU 44 to be selected. The issue
stage may be a convenient point for insertion in pipeline
60 since it is the beginning of execution of instructions.
The snoop operation may be treated like an instruction
by the remaining pipeline stages. Thus, the snoop operation
may perform its state change to data cache 42
and/or retrieve data from data cache 42 in the cache
access stages 70 and 72. The snoop operation includes
its address, and thus the address generation stage may
add zero to the address. The address is physical, so
it may not be translated by the TLB. Other embodiments
may have pipelines having fewer or greater numbers of
stages, according to design choice. Furthermore, other
embodiments may insert the operation from BIU 44 at
other stages of the pipeline, as desired.
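
The selection and stall behavior of mux 74 can be modeled in a few
lines. This is a simplified sketch: mux74 and run are hypothetical
names, one operation is assumed to issue per cycle, and snoop
operations are supplied as a mapping from cycle number to operation.

```python
def mux74(decoded_instruction, snoop_operation):
    """Return (operation sent to the issue stage, stall for the decode stage)."""
    if snoop_operation is not None:
        # Selecting the BIU input blocks any instruction waiting in decode.
        return snoop_operation, decoded_instruction is not None
    return decoded_instruction, False


def run(instructions, snoops_by_cycle):
    """Issue order when snoop operations are inserted at given cycles."""
    waiting = list(instructions)
    issued, cycle = [], 0
    while waiting or any(c >= cycle for c in snoops_by_cycle):
        decoded = waiting[0] if waiting else None
        selected, stall = mux74(decoded, snoops_by_cycle.get(cycle))
        if selected is not None:
            issued.append(selected)
        # A stalled instruction stays in decode and retries next cycle.
        if decoded is not None and not stall:
            waiting.pop(0)
        cycle += 1
    return issued
```

For example, run(["ld", "st"], {1: "snoop"}) issues the load, then the
snoop operation, then the stalled store, so a store already past the
mux completes ahead of the snoop.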
[0074] Once the operation reaches the end of pipeline 
60, the state change is complete in data cache 42 and 
the data (and its state) is available for BIU 44 (if applicable).
This information may be passed from data cache
42 to BIU 44. BIU 44 may update snoop tags 46 and 
provide the data (and its state) on bus 24. 
[0075] Turning now to Fig. 7, a flowchart is shown il- 
lustrating operation of one embodiment of BIU 44 with 
respect to snooping operations. Other embodiments are 
possible and contemplated. While the blocks illustrated 
in Fig. 7 are shown in a particular order for ease of un- 
derstanding, any suitable order may be used. Further- 
more, each of decision blocks 80, 82, and 84 may rep- 
resent independent blocks of circuitry which may oper- 
ate in parallel. Other blocks may be performed in parallel 
as well in the combinatorial logic circuitry of BIU 44. Fur- 
thermore, various blocks may be performed in different 
clock cycles according to the bus protocol and design 
choice within BIU 44. 

[0076] If there is a snoop operation in snoop queue 
48 (decision block 80), BIU 44 reads the snoop tags 46 
(block 86). If the address of the transaction being 
snooped is a snoop hit (decision block 88), BIU 44 may 
optionally (if a state change is to be performed for the 
affected cache line or a data fetch from data cache 42 
is to be performed) generate a snoop operation and in- 
sert it into pipeline 60 (block 90). Additionally, BIU 44 
may determine the response based on the snoop hit in- 
formation for transmission during the response phase 
of the transaction (block 92). If the address of the trans- 
action being snooped is not a snoop hit, the snoop re- 
sponse is invalid. 

[0077] If a snoop operation is completing in pipeline 
60 (decision block 82), BIU 44 may update snoop tags 
46 to reflect the new state of the cache line (thus remain- 
ing consistent with data cache 42) (block 94). Addition- 
ally, if data was fetched from data cache 42 for trans- 
mission on bus 24, BIU 44 may capture the data for 
transmission on bus 24 and may arbitrate for the data 
bus and perform the data phase of the transaction (block 
96). BIU 44 may provide the exclusive or modified state 
of the line from data cache 42 as well, using the D_Mod 
signal. Finally, if data cache 42 evicts a cache line (e.g. 
due to a line fill of another cache line) (decision block 
84), BIU 44 may invalidate the corresponding tag in 
snoop tags 46 (block 98). If the evicted block is modified,
BIU 44 may perform a write transaction to write the evict- 
ed block back to the memory system. 

[0078] Turning next to Fig. 8, a flowchart illustrating
operation of one embodiment of the memory system for
a read transaction is shown. Other embodiments are
possible and contemplated. While the blocks illustrated
in Fig. 8 are illustrated in a particular order for ease of
understanding, any suitable order may be used. Furthermore,
blocks may be performed in parallel by various
circuitry in the memory system. Still further, various
blocks may be performed in different clock cycles according
to the bus protocol and design choice within the
memory system.

[0079] If the transaction is a miss in L2 cache 14 (decision
block 100), the memory system determines if the
transaction is cacheable in L2 cache 14 (decision block
102). In one embodiment, a signal in the address phase
of the transaction may indicate whether or not the transaction
is L2 cacheable. Other embodiments may define
L2 cacheability in other ways. If the transaction is L2
cacheable, L2 cache 14 may allocate an L2 cache line
for the data and may capture the data during the data
phase (block 104). Memory controller 16 may read the
data from memory and provide the data if the exclusive
response is not given during the response phase of the
transaction.

[0080] If the transaction is not L2 cacheable (decision
block 102), memory controller 16 may determine if the
response phase includes the exclusive response (decision
block 106). If the exclusive response is received,
the memory controller 16 may capture the data if the
data phase indicates the data is modified (block 108) for
update into memory 26. Alternatively, the receiving
agent may receive the data as modified, if desired. If the
data is not modified, memory controller 16 may not update
memory 26. If the exclusive response is not received,
memory controller 16 may provide the data from
memory 26 in the data phase of the transaction (block
110).

[0081] If the transaction is an L2 cache hit (decision
block 100), L2 cache 14 determines if the exclusive response
is received in the response phase of the transaction
(decision block 112). If the exclusive response is
not received, L2 cache 14 provides the data for the
transaction in the data phase (block 116). If the exclusive
response is received, L2 cache 14 may update the hitting
cache line with the data corresponding to the transaction
if the data is indicated as modified in the data
phase via the D_Mod signal (block 114). If the data is
not indicated as modified, L2 cache 14 may not update
the cache line.
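
The decision flow of Fig. 8 for a read transaction can be sketched as
a single function. The function name, parameters, and action strings
are hypothetical; the block numbers in the comments refer to the
flowchart blocks named in the text.

```python
def memory_system_read(l2_hit, l2_cacheable, exclusive_response, d_mod):
    """Return the actions the memory system takes for one read transaction."""
    actions = []
    if l2_hit:  # decision block 100
        if not exclusive_response:  # decision block 112
            actions.append("L2 provides data")  # block 116
        elif d_mod:
            actions.append("L2 updates hitting line")  # block 114
        return actions
    if l2_cacheable:  # decision block 102
        actions.append("L2 allocates line and captures data")  # block 104
        if not exclusive_response:
            actions.append("memory controller provides data")
        return actions
    if exclusive_response:  # decision block 106
        if d_mod:
            actions.append("memory controller updates memory")  # block 108
    else:
        actions.append("memory controller provides data")  # block 110
    return actions
```

When the snooping agent responds exclusive and the data arrives with
D_Mod deasserted, the function returns no actions for an L2 hit, since
neither the L2 cache nor memory needs an update.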

[0082] It is noted that L2 cache 14 is an optional part
of the memory system. A memory system not including
L2 cache 14 may be represented by blocks 106, 108,
and 110. It is further noted that, for write transactions,
the memory system may capture the data irrespective
of receiving an exclusive response in the response
phase of the write transaction.

[0083] Turning next to Fig. 9, a block diagram of a car- 
rier medium 120 including a database representative of 
system 10 is shown. Generally speaking, a carrier me- 
dium may include storage media such as magnetic or 
optical media, e.g., disk or CD-ROM, volatile or non-vol- 
atile memory media such as RAM (e.g. SDRAM, 
RDRAM, SRAM, etc.), ROM, etc., as well as transmis- 
sion media or signals such as electrical, electromagnet- 
ic, or digital signals, conveyed via a communication me- 
dium such as a network and/or a wireless link. 
[0084] Generally, the database of system 10 carried 
on carrier medium 120 may be a database which can 
be read by a program and used, directly or indirectly, to 
fabricate the hardware comprising system 10. For ex- 
ample, the database may be a behavioral-level descrip- 
tion or register-transfer level (RTL) description of the 
hardware functionality in a high level design language 
(HDL) such as Verilog or VHDL. The description may be
read by a synthesis tool which may synthesize the de- 
scription to produce a netlist comprising a list of gates 
in a synthesis library. The netlist comprises a set of 
gates which also represent the functionality of the hardware
comprising system 10. The netlist may then be
placed and routed to produce a data set describing ge- 
ometric shapes to be applied to masks. The masks may 
then be used in various semiconductor fabrication steps 
to produce a semiconductor circuit or circuits corre- 
sponding to system 10. Alternatively, the database on 
carrier medium 120 may be the netlist (with or without
the synthesis library) or the data set, as desired. 

[0085] While carrier medium 120 carries a representation
of system 10, other embodiments may carry a representation
of any portion of system 10, as desired, including
any set of one or more agents (e.g. processors,
L2 cache, memory controller, etc.) or circuitry therein
(e.g. BIUs, caches, tags, etc.), etc.

[0086] The databases described above may comprise
a circuit defining mechanism for the system 10 or
portions thereof. 

[0087] Numerous variations and modifications will be- 
come apparent to those skilled in the art once the above 
disclosure is fully appreciated. It is intended that the fol- 
lowing claims be interpreted to embrace all such varia- 
tions and modifications. 



Claims 

1. An apparatus comprising: 

a circuit coupled to receive an address, said cir- 
cuit configured to transmit an indication of a 
state, within said circuit, of data corresponding 
to said address, said indication indicating an 
exclusive state for both said exclusive state and
a modified state of said data within said circuit. 

2. The apparatus as recited in claim 1 wherein said
address is part of a read and said circuit is configured
to provide said data for said read responsive
to said circuit indicating exclusive via said indication.

3. The apparatus as recited in claim 2 further comprising
a cache, wherein said cache is configured to
provide said data responsive to said address hitting
in said cache and said indication not indicating said
exclusive state.

4. The apparatus as recited in claim 3 further comprising
a memory controller, wherein said memory controller
is configured to provide said data responsive
to said address missing in said cache and said indication
not indicating said exclusive state.



10. The apparatus as recited in claim 9 further comprising
a pipeline, wherein a first stage of said pipeline
is coupled to said cache, and wherein said circuit is
configured to insert an operation to fetch said data
from said cache in a second stage of said pipeline
prior to said first stage in response to said snoop
tags indicating said exclusive state for said data.

11. The apparatus as recited in claim 10 wherein said
cache is configured to update to a new state for said
data responsive to said operation being in said first
stage.

12. The apparatus as recited in claim 11 wherein said
snoop tags are updated to said new state for said
data responsive to said operation being in said first
stage.



5. The apparatus as recited in claim 2 wherein said 
circuit is configured to transmit a second indication 20 
indicating whether said data is in said exclusive 
state or said modified state concurrent with provid- 
ing said data for said read. 

6. The apparatus as recited in claim 5 further compris- 25 
ing a cache coupled to receive said second indica- 
tion, wherein said cache is configured to update a 
cache line storing said data responsive to said ad- 
dress hitting in said cache and said second indica- 
tion indicating that said data is in said modified 30 
state. 

7. The apparatus as recited in claim 6 further compris- 
ing a memory controller coupled to receive said sec- 
ond indication, wherein said memory controller is 35 
configured to update a memory to which said mem- 
ory controller is coupled with said data responsive 

to said address missing in said cache, said address 
being not cacheable in said cache, and said second 
indication indicating that said data is in said modi- <*o 
tied state. 

8. The apparatus as recited in claim 7 wherein said 
cache and said memory controller do not update 
with said data responsive to said second indication 
indicates that said data is in said exclusive state. 

9. The apparatus as recited in claim 1 wherein said 
circuit comprises a cache and a snoop tags storage, 
wherein said snoop tags storage is configured to so 
maintain said exclusive state even if corresponding 
data is modified, and wherein said circuit is config- 
ured to snoop said snoop tags storage in response 

to said address, and wherein said cache is config- 
ured to maintain a modified state and an exclusive 55 
state if said data is held exclusively, depending on 
whether or not said data is modified. . 
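
By way of illustration only, the arrangement of claims 9 through 12 may be modeled as sketched below. The class `Agent`, its method names, the single-character state letters, and the `new_state` argument are all hypothetical; in particular, the claims do not say which new state is entered, so the caller supplies it here.

```python
class Agent:
    """Toy model: the cache tracks E versus M, the snoop tags track only E."""

    def __init__(self):
        self.cache_state = {}  # address -> "E", "M", or a later new state
        self.cache_data = {}   # address -> data bytes
        self.snoop_tags = {}   # address -> "E" for both exclusive and modified
        self.pending = []      # fetch operations inserted ahead of the cache stage

    def fill_exclusive(self, addr, data):
        # The line is obtained exclusively; both structures record "E".
        self.cache_state[addr] = "E"
        self.cache_data[addr] = data
        self.snoop_tags[addr] = "E"

    def write(self, addr, data):
        # Claim 9: the cache moves to modified, the snoop tags stay exclusive.
        self.cache_data[addr] = data
        self.cache_state[addr] = "M"

    def snoop(self, addr):
        # Only the snoop tags are consulted; the cache itself is not read here.
        if self.snoop_tags.get(addr) == "E":
            self.pending.append(addr)  # claim 10: insert a fetch operation
            return "exclusive"
        return "none"

    def cache_stage(self, new_state):
        # Claims 11 and 12: when the fetch reaches the cache stage, the data
        # is read and both the cache and the snoop tags take the new state.
        addr = self.pending.pop(0)
        modified = self.cache_state[addr] == "M"
        data = self.cache_data[addr]
        self.cache_state[addr] = new_state
        self.snoop_tags[addr] = new_state
        return data, modified
```

In this model the modified flag becomes known only when the fetch operation reaches the cache stage, which is why the snoop response cannot distinguish exclusive from modified.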



13. A circuit defining mechanism comprising one or
more databases representing the apparatus as recited
in any of claims 1-12.

14. A carrier medium carrying the circuit defining
mechanism as recited in claim 13.

15. A method comprising: 

receiving an address of a read; and 
responding to said address during a response 
phase of said read with an exclusive state for 
both said exclusive state and a modified state 
of data corresponding to said address. 

16. The method as recited in claim 15 further compris- 
ing transmitting data for said read. 

17. The method as recited in claim 16 further compris- 
ing transmitting which one of said exclusive state or 
said modified state corresponds to said data con- 
current with said transmitting data. 

18. The method as recited in claim 17 further compris- 
ing storing said data in a cache responsive to said 
address hitting in said cache and said transmitting 
said modified state. 

19. The method as recited in claim 17 further compris- 
ing storing said data in a memory responsive to said 
address missing in said cache, being not cacheable 
in said cache, and said transmitting said modified 
state. 
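
By way of illustration only, the method of claims 15 through 19 may be sketched as three small functions. The names `State`, `Response`, `response_phase`, `data_phase`, and `data_sink` are hypothetical, and the case of a modified line that misses in the cache but is cacheable is not addressed by these claims, so the sketch reports it as unspecified.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"


class Response(Enum):
    NONE = "none"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


def response_phase(state: State) -> Response:
    """Claim 15: respond exclusive for both the exclusive and modified states."""
    if state in (State.EXCLUSIVE, State.MODIFIED):
        return Response.EXCLUSIVE
    if state is State.SHARED:
        return Response.SHARED
    return Response.NONE


def data_phase(state: State, data: bytes) -> Tuple[bytes, bool]:
    """Claims 16-17: send the data together with a modified flag."""
    return data, state is State.MODIFIED


def data_sink(modified: bool, cache_hit: bool, cacheable: bool) -> str:
    """Claims 18-19: decide where the transferred data is also stored."""
    if not modified:
        return "none"         # exclusive data is clean; nothing to store
    if cache_hit:
        return "cache"        # claim 18: hit in the cache and modified
    if not cacheable:
        return "memory"       # claim 19: miss, not cacheable, and modified
    return "unspecified"      # not covered by claims 18-19
```

For example, an agent holding a line in the modified state responds exclusive in the response phase and reveals the modified state only when it transmits the data.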

20. An apparatus comprising: 

a circuit coupled to receive an address of a
read, said circuit configured to provide data corresponding
to said read responsive to said circuit
having said data in an exclusive state.

21. The apparatus as recited in claim 20 wherein said
circuit is further configured to provide said data for
said read responsive to said circuit having said data
in a modified state.

22. The apparatus as recited in claim 21 wherein said 
circuit is configured to signal which of said exclusive 
state or said modified state said circuit has said data 
in concurrent with providing said data. 

23. The apparatus as recited in claim 21 wherein said 
circuit, during a response phase of said read, is con- 
figured to indicate said exclusive state for said data 
independent of which of said exclusive state and 
said modified state said circuit has said data in. 

24. The apparatus as recited in claim 23 wherein said
circuit is configured to snoop responsive to said address
to indicate said exclusive state during said response
phase, and wherein said circuit does not determine
if said data is in said modified state during
snooping.

25. A circuit defining mechanism comprising one or
more databases representing the apparatus as recited
in any of claims 20-24.

26. A carrier medium carrying the circuit defining mech- 
anism as recited in claim 25. 

27. A method comprising: 

receiving an address of a read; and 
transmitting data for said read transaction responsive
to having said data in an exclusive
state.

28. The method as recited in claim 27 further comprising
transmitting data for said read responsive to
having said data in a modified state.

29. The method as recited in claim 28 further comprising
transmitting which one of said exclusive state or
said modified state corresponds to said data concurrent
with said transmitting data.

30. The method as recited in claim 27 further comprising
responding during a response phase of said
read with an exclusive state for both said exclusive
state and a modified state of data corresponding to
said address.